Modularity from Fluctuations in Random Graphs and Complex Networks

The mechanisms by which modularity emerges in complex networks are not well understood, but recent reports have suggested that modularity may arise from evolutionary selection. We show that finding the modularity of a network is analogous to finding the ground-state energy of a spin system. Moreover, we demonstrate that, due to fluctuations, stochastic network models give rise to modular networks. Specifically, we show both numerically and analytically that random graphs and scale-free networks have modularity. We argue that this fact must be taken into consideration to define statistically significant modularity in complex networks.

Statistical, mathematical, and model-based analyses of complex networks have recently uncovered interesting unifying patterns in networks from seemingly unrelated disciplines [1, 2, 3, 4, 5]. In spite of these advances, many properties of complex networks remain elusive, a prominent one being modularity [6, 7]. For example, it is a matter of common experience that social networks have communities of highly interconnected nodes that are poorly connected to nodes in other communities. Such modular structures have been reported not only in social networks [6, 7, 8], but also in biochemical networks [9], food webs [10], and the Internet. It is widely believed that the modular structure of complex networks plays a critical role in their functionality [9]. There is therefore a clear need to develop algorithms to identify modules accurately.

More fundamentally, the mechanisms by which modularity emerges in complex networks are not well understood. In biological networks, both biochemical and ecological, researchers have suggested that modularity increases robustness, flexibility, and stability. Similarly, in engineered networks, it has been suggested that modularity is an effective way to achieve adaptability in rapidly changing environments [14]. It may therefore seem that evolutionary pressures make networks modular, implying that any successful model of complex networks should take into account external factors that enhance modularity. Recently, however, Solé and Fernández have pointed out that models without any external pressure are able to give rise to modular networks [15].

In this Letter, we show that Erdős-Rényi (ER) random graphs, in which any pair of nodes is connected with probability $p$ [16], have a high modularity. We show numerically and analytically that this high modularity is due to fluctuations in the establishment of links, which are magnified by the large number of ways in which a network can be partitioned into modules. Furthermore, we show that one obtains similar results when considering scale-free networks [2]. We conclude by discussing how these results should be taken into consideration to define statistically significant modularity in complex networks.

Following the first quantitative definition of modularity, several groups have proposed heuristic algorithms to detect modules in complex networks. For a given partition of the nodes of a network into modules, the modularity $\mathcal{M}$ of this partition is defined as

$$\mathcal{M} = \sum_{s=1}^{r} \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^2 \right], \qquad (1)$$

where $r$ is the number of modules, $L$ is the number of links in the network, $l_s$ is the number of links between nodes in module $s$, and $d_s$ is the sum of the degrees of the nodes in module $s$. This definition of modularity implies that $\mathcal{M} < 1$ and that $\mathcal{M} = 0$ for a random partition of the nodes. We define the modularity $M$ of a network as the largest modularity of all possible partitions of the network, $M = \max\{\mathcal{M}\}$.
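As an illustration of the definition in Eq. (1), the following minimal sketch evaluates the modularity of a given partition from a list of links. The function name `partition_modularity`, the edge-list representation, and the toy graph are assumptions made for this example and are not part of the original text.

```python
from collections import defaultdict

def partition_modularity(edges, module_of):
    """Modularity of one partition, following Eq. (1).

    edges:     list of (u, v) pairs, one entry per undirected link
    module_of: dict mapping each node to its module label
    """
    L = len(edges)
    l = defaultdict(int)  # l_s: links with both ends in module s
    d = defaultdict(int)  # d_s: summed degree of the nodes in module s
    for u, v in edges:
        d[module_of[u]] += 1
        d[module_of[v]] += 1
        if module_of[u] == module_of[v]:
            l[module_of[u]] += 1
    return sum(l[s] / L - (d[s] / (2 * L)) ** 2 for s in d)


# Toy graph (hypothetical): two triangles joined by a single link.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_module = {v: "all" for v in range(6)}
print(partition_modularity(edges, two_modules))  # 2 * (3/7 - 1/4), about 0.357
print(partition_modularity(edges, one_module))   # 0.0
```

Placing all nodes in a single module gives zero, while the split into the two triangles gives a positive value. The modularity $M$ of the network is the maximum of this quantity over all partitions.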

The problem of finding the modularity of a network with $S$ nodes is therefore analogous to the standard statistical mechanics problem of finding the ground-state energy of the Hamiltonian $\mathcal{H} = -L\mathcal{M}$. Specifically, one can map the network into a spin system by defining the variables $s_i \in \{1, 2, \dots, S\}$ as the module to which node $i$ belongs and the couplings $J_{ij}$ as being 1 if nodes $i$ and $j$ are connected in the network and 0 otherwise. Then, from Eq. (1), one can express $\mathcal{H}$ explicitly in terms of the $s_i$ and the $J_{ij}$ [Eq. (2)].

This Hamiltonian corresponds to an $S$-state Potts model with both ferromagnetic and anti-ferromagnetic terms, and two-, three-, and four-spin interactions. Therefore, it seems difficult to apply methods used in problems that are similar but formally simpler, like the graph coloring problem. Rather, we propose here a heuristic estimation of the modularity for a number of interesting graph models, namely low-dimensional regular lattices, ER random graphs [16], and scale-free networks [2].
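To make the spin mapping concrete, the following short derivation sketches one way of writing $\mathcal{H} = -L\mathcal{M}$ in terms of the $s_i$ and $J_{ij}$, starting from the definition of the partition modularity. The node degree $q_i$ is a symbol introduced only for this sketch, and the final expression is one equivalent form that need not coincide term by term with the form used in the original Eq. (2).

```latex
% q_i = \sum_j J_{ij} is the degree of node i (symbol introduced for this sketch)
l_s = \sum_{i<j} J_{ij}\,\delta_{s_i s}\,\delta_{s_j s}, \qquad
d_s = \sum_i q_i\,\delta_{s_i s}, \qquad
L = \tfrac{1}{2}\sum_{i,j} J_{ij}

\mathcal{H} = -L\mathcal{M}
            = -\sum_s l_s + \frac{1}{4L}\sum_s d_s^2
            = -\sum_{i<j} J_{ij}\,\delta_{s_i s_j}
              + \frac{1}{4L}\sum_{i,j} q_i q_j\,\delta_{s_i s_j}
```

The first term rewards placing linked nodes in the same module (ferromagnetic), while the second penalizes modules with a large total degree (anti-ferromagnetic). Expanding $q_i q_j$ in terms of the couplings produces products of two $J$'s.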

Low-dimensional regular lattices. Consider a one-dimensional lattice with $S$ nodes, each one connected to its two neighbors. This case is particularly simple because the modules comprise only contiguous nodes and, therefore, the number of between-module links equals the number $r$ of modules. Assuming that all modules have approximately the same size $n = S/r$, the modularity of a partition with $r$ modules is

$$M_{1D}(S; r) = 1 - \frac{r}{S} - \frac{1}{r}, \qquad (3)$$

where we have used the fact that the number $L$ of links is $L \approx S$. Under these assumptions, the problem of finding the modularity of a regular one-dimensional lattice is reduced to finding the optimal number $r^*$ of modules, that is, the number of modules that yields the maximum modularity. One can show that $r^*(S) = \sqrt{S}$, and the modularity is

$$M_{1D}(S) = 1 - \frac{2}{\sqrt{S}}. \qquad (4)$$

Note that the only assumption in the calculation is that all modules have approximately the same number of nodes. Numerical results confirm that this is a sensible assumption.
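The optimization over the number of modules can be checked with a few lines of code. The sketch below assumes the equal-size form $1 - r/S - 1/r$, which follows from the definition of modularity when $L \approx S$, each module has $S/r - 1$ internal links, and each module has a total degree of $2S/r$. The function names are hypothetical.

```python
import math

def modularity_ring(S, r):
    """Modularity of a ring of S nodes cut into r equal contiguous modules."""
    return 1 - r / S - 1 / r

def best_ring_split(S):
    """Scan every integer r and return (r*, modularity at r*)."""
    r_star = max(range(1, S + 1), key=lambda r: modularity_ring(S, r))
    return r_star, modularity_ring(S, r_star)

S = 100
r_star, m_star = best_ring_split(S)
print(r_star, m_star)          # r* = 10 = sqrt(100), modularity about 0.8
print(1 - 2 / math.sqrt(S))    # closed form 1 - 2/sqrt(S), also about 0.8
```

For perfect squares the integer scan recovers $r^* = \sqrt{S}$ exactly. For other sizes the best integer $r$ is one of the two integers bracketing $\sqrt{S}$.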

One can generalize this result to one-dimensional lattices in which each node is connected to $z$ nodes on the left and $z$ on the right. In this case, the leading contributions to the modularity are

$$M_{1D}(S, z) = 1 - \sqrt{\frac{2(z+1)}{S}}. \qquad (5)$$



Similarly, one can calculate the modularity of $d$-dimensional cubic lattices in which each node is connected to $2z$ nodes in each one of the $d$ directions, to obtain that

$$M_{dD}(S, z) = 1 - (d+1) \left( \frac{z+1}{2d} \right)^{\frac{d}{d+1}} \frac{1}{S^{\frac{1}{d+1}}}. \qquad (6)$$
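The lattice expressions can be evaluated directly. The sketch below assumes that the $d$-dimensional result has the form $1 - (d+1)\,[(z+1)/(2d)]^{d/(d+1)}\,S^{-1/(d+1)}$, which is the form that reduces to the one-dimensional results for $d = 1$. The function name `modularity_lattice` is hypothetical.

```python
import math

def modularity_lattice(S, z=1, d=1):
    """Leading-order modularity of a d-dimensional cubic lattice with S nodes,
    each linked to 2z nodes along every one of the d directions."""
    return 1 - (d + 1) * ((z + 1) / (2 * d)) ** (d / (d + 1)) / S ** (1 / (d + 1))

S = 10_000
for d in (1, 2, 3):
    print(d, modularity_lattice(S, z=1, d=d))

# For d = 1 the expression reduces to 1 - sqrt(2 (z + 1) / S).
print(1 - math.sqrt(2 * (1 + 1) / S))
```

At fixed $S$ and $z$, the modularity decreases as the dimension grows, because a larger fraction of the links crosses module boundaries.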



Random graphs. In ER random graphs [16], each pair of nodes is connected with probability $p$. As for $d$-dimensional lattices, we assume that the partition of the network with highest modularity consists of $r$ modules with approximately the same number of nodes $n = S/r$, the same number of within-module links $k_i$, and the same number of links $k_o$ to other modules. In the $S \gg 1$ limit, we can assume that the total number of links is $S^2 p/2$ and, therefore, $k_i$ and $k_o$ are related through this total [Eq. (7)]. Hence, for $S \gg 1$, the modularity of such a partition is simply

$$\mathcal{M}(S, p; r, k_i) = \frac{2 r k_i}{S^2 p} - \frac{1}{r}. \qquad (8)$$

Under these assumptions, the problem of finding the modularity of a random graph is reduced to finding a partition of the graph with the following properties: (i) the partition consists of $r$ equal modules, each one with $k_i$ within-module links; (ii) the partition typically exists in a random graph; and (iii) the partition yields the maximum modularity relative to the other partitions that typically exist.

In a random graph with $S$ nodes and linking probability $p$, the average number $\mathcal{N}$ of different partitions with $r$ identical modules, each with $k_i$ links, is $\mathcal{N}(S, p; r, k_i)$. A certain partition typically exists if $\mathcal{N}(S, p; r, k_i) > 1$. Among all the partitions that typically exist, we are interested in the one whose modularity is maximum. In other words, given a certain number $r$ of modules, we want a partition with as many within-module links as possible. Therefore, if one finds a very common partition, $\mathcal{N}(S, p; r, k_i) \gg 1$, it must be possible to find another partition with the same $r$ and $k_i' > k_i$ that has larger modularity. This new partition will be rarer than the former one, $\mathcal{N}(S, p; r, k_i') < \mathcal{N}(S, p; r, k_i)$. By iterating this argument, one concludes that the partition we are interested in must satisfy

$$\mathcal{N}(S, p; r, k^*(S, p; r)) = 1, \qquad (9)$$

where $k^*(S, p; r)$ is the maximum number of within-module links that one can typically find in a partition with $r$ identical modules.

To calculate $\mathcal{N}(S, p; r, k_i)$, we use the following process. First, we calculate the number $\mathcal{N}_1$ of ways in which a module of size $n = S/r$, with $k_i$ within-module links and $k_o(r, k_i)$ external links, can be separated from the rest of the graph:

$$\mathcal{N}_1 = \binom{S}{n} P_i(S, p; n, k_i)\, P_o(S, p; n, k_o), \qquad (10)$$

where

$$P_i(S, p; n, x) = \binom{n(n-1)/2}{x} p^x (1-p)^{n(n-1)/2 - x}, \qquad (11)$$

$$P_o(S, p; n, x) = \binom{n(S-n)}{x} p^x (1-p)^{n(S-n) - x}. \qquad (12)$$

The next step is to separate the second module from the remaining set of $S - n$ nodes. It is important to note that the second module only needs to establish $k_o (1 - n/(S-n))$ external links, because the remaining $k_o n/(S-n)$ are already established with the first module. Therefore,

$$\mathcal{N}_2 = \binom{S-n}{n} P_i(S, p; n, k_i)\, P_o\!\left(S, p; n, k_o \left(1 - \frac{n}{S-n}\right)\right). \qquad (13)$$

Repeating this separation process, one can see that the general term is of the form

$$\mathcal{N}_{t+1} = \binom{S - tn}{n} P_i(S, p; n, k_i)\, P_o\!\left(S, p; n, k_o \left(1 - \frac{tn}{S-n}\right)\right). \qquad (14)$$

Finally, $\mathcal{N}(S, p; r, k_i)$ is the product of all the individual module separations,

$$\mathcal{N}(S, p; r, k_i, k_o(r, k_i)) = \prod_{t=1}^{r} \mathcal{N}_t, \qquad (15)$$

so that Eq. (9) can be solved numerically to obtain $k^*(S, p; r)$ using Eqs. (11), (12), and (15).

Once we find $k^*(S, p; r)$ for a given value of $r$, we use Eq. (8) to obtain the modularity. Finally, we select the optimal number of modules $r = r^*(S, p)$, and the modularity $M_{ER}(S, p)$ of the ER random graph is

$$M_{ER}(S, p) = \frac{2\, r^*(S, p)\, k^*(S, p; r^*)}{S^2 p} - \frac{1}{r^*(S, p)}. \qquad (16)$$

In Fig. 1(a), we compare the modularity of ER graphs obtained through optimization of Eq. (1) using simulated annealing [19] with the predictions of Eq. (16). We find good agreement in the relevant region of sparse but connected graphs, that is, $2/S < p \ll 1$.

FIG. 1: Modularity in Erdős-Rényi random graphs. (a) Comparison of numerical results of the modularity as a function of the linking probability $p$ and the predictions of Eqs. (16) and (19). The numerical results are obtained by maximizing the modularity, Eq. (1), using simulated annealing [19]. (b) Modularity as a function of $pS$ for large networks, as predicted by Eq. (16). In both (a) and (b), numerical problems in the solution of Eq. (9) prevent us from obtaining values of the modularity for larger values of $p$.

Equation (16) enables us to obtain the modularity of large random graphs, something that would not be possible using simulated annealing because of the computational cost. In Fig. 1(b) we show that for $S \to \infty$ the modularity depends only on $pS$ [Eq. (17)].

To obtain a closed expression for $M_{ER}$ for any value of $S$, we note that at the percolation point $pS = 2$ the random graph contains essentially no loops, that is, the graph is a tree. In this case, one can find partitions in which the number of between-module links equals the number of modules $r$, as in the simple one-dimensional case, and the modularity is

$$M_{ER}(S, p = 2/S) = M_{1D}(S) = 1 - \frac{2}{\sqrt{S}}. \qquad (18)$$

We propose the simplest ansatz that verifies Eqs. (17) and (18) simultaneously,

$$M_{ER}(S, p) = \left(1 - \frac{2}{\sqrt{S}}\right) \cdots \qquad (19)$$

In Fig. 1(a), we show that Eq. (19) is in good agreement with values obtained using simulated annealing.
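The numerical comparison relies on maximizing the partition modularity by simulated annealing. The sketch below shows the general idea on a small ER graph. It is not the implementation of Ref. [19]: the single-node move set, the geometric cooling schedule, the parameter values, and all function names are assumptions chosen for brevity.

```python
import math
import random

def modularity(edges, module_of):
    """Partition modularity, Eq. (1), for module_of (node -> module label)."""
    L = len(edges)
    within, degree = {}, {}
    for u, v in edges:
        mu, mv = module_of[u], module_of[v]
        degree[mu] = degree.get(mu, 0) + 1
        degree[mv] = degree.get(mv, 0) + 1
        if mu == mv:
            within[mu] = within.get(mu, 0) + 1
    return sum(within.get(s, 0) / L - (d / (2 * L)) ** 2 for s, d in degree.items())

def er_graph(S, p, rng):
    """ER graph: each of the S(S-1)/2 pairs is linked with probability p."""
    return [(u, v) for u in range(S) for v in range(u) if rng.random() < p]

def anneal(edges, S, steps=5000, t_start=0.05, cooling=0.998, seed=1):
    """Single-node moves with Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)
    current = {v: v for v in range(S)}        # start with one module per node
    m_cur = modularity(edges, current)
    best, m_best = dict(current), m_cur
    T = t_start
    for _ in range(steps):
        node, label = rng.randrange(S), rng.randrange(S)
        old = current[node]
        current[node] = label                 # tentatively move the node
        m_new = modularity(edges, current)
        if m_new >= m_cur or rng.random() < math.exp((m_new - m_cur) / T):
            m_cur = m_new
            if m_cur > m_best:
                best, m_best = dict(current), m_cur
        else:
            current[node] = old               # reject: undo the move
        T *= cooling
    return best, m_best

rng = random.Random(0)
edges = er_graph(S=60, p=0.05, rng=rng)
partition, m_found = anneal(edges, S=60)
print(len(edges), m_found)
```

The value returned is only a lower bound on the modularity $M$ of the graph, since a short annealing run may not reach the optimal partition. Recomputing the full modularity at every step keeps the sketch short but is far too slow for large networks.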

Our analytic treatment allows us to explain the origin of the modularity in random graphs. The typical partition of an ER graph into modules of size $n$ is very unlikely to have a number of within-module links $k_i$ larger than the average $p\,n(n-1)/2$ expected for a random partition of the nodes. However, the number of possible partitions, $S!/(n!)^r$, is so large that, typically, there exists a partition whose $k_i$ is much larger than the average. For example, for a network with $S = 200$ and $p = 0.02$, one typically finds a partition with $r = 7$ modules and $k_i \approx 36$, instead of the value $k_i \approx 8$ expected for a random partition.
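The numbers in this example can be reproduced approximately with a short calculation. The sketch below assumes the binomial form of the within-module probability, $\binom{n(n-1)/2}{x} p^x (1-p)^{n(n-1)/2 - x}$, and the equal-module modularity $2 r k_i/(S^2 p) - 1/r$ that follows from Eq. (1). Because $n = S/r$ is not an integer here, factorials are extended with the gamma function. All function and variable names are hypothetical.

```python
import math

def log_binom(n, k):
    """Log of the binomial coefficient, extended to real n and k via lgamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log_p_within(n, p, x):
    """Log-probability that a given set of n nodes holds x internal links."""
    pairs = n * (n - 1) / 2
    return log_binom(pairs, x) + x * math.log(p) + (pairs - x) * math.log(1 - p)

S, p, r = 200, 0.02, 7
n = S / r                                    # module size (not an integer here)
mean_ki = p * n * (n - 1) / 2                # expected within-module links
log_partitions = math.lgamma(S + 1) - r * math.lgamma(n + 1)  # log[S!/(n!)^r]

def modularity_equal_modules(k_i):
    """Modularity of r equal modules with k_i internal links each."""
    return 2 * r * k_i / (S**2 * p) - 1 / r

print(round(mean_ki, 2))                     # about 7.88
print(log_p_within(n, p, 36))                # strongly negative
print(log_partitions)                        # large and positive
print(modularity_equal_modules(mean_ki))     # close to 0
print(modularity_equal_modules(36))          # about 0.49
```

A single prescribed module is very unlikely to contain 36 links, but the logarithm of the number of partitions is of the order of several hundred, which is what allows such atypical partitions to exist.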

Remarkably, the modularity of a random graph can be as large as that of a graph with modular structure imposed at the outset. In such a graph, nodes are divided into modules and each pair of nodes is connected with probability $p_i$ if they belong to the same module, and with probability $p_o < p_i$ otherwise. Using the same example as before, the modularity of an ER graph with $S = 200$ and $p = 0.02$ is the same as the modularity of a graph with $m = 7$ modules, $p_i \approx 0.09$, and $p_o \approx 0.004$.

Scale-free networks. So far, we have considered $d$-dimensional regular lattices and ER random graphs, in which all nodes have essentially the same degree. However, many complex networks display scale-free degree distributions [4], meaning that some nodes have degrees that are orders of magnitude larger than the average. Since the results presented for ER graphs rely on the fact that there are many partitions of the network and implicitly on the fact that nodes are exchangeable, it is worth asking whether "random" scale-free networks also display modularity.

To answer this question, we use the scale-free model proposed in [2]. In the model, the network grows by the addition of new nodes. Each time a new node is added, it establishes $m$ preferential connections to nodes already in the network. In Fig. 2, we show the modularity of scale-free networks as a function of the network size $S$ for different values of $m$. As before, we find the modularity by optimizing Eq. (1) using simulated annealing. As for ER graphs, the modularity approaches a finite value for large $S$ and decreases with the connectivity $m$.
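The growth rule described above can be sketched in a few lines. The following is a minimal preferential-attachment generator using only the standard library. The fully connected seed of $m + 1$ nodes, the rejection of repeated targets, and the function name are assumptions of this sketch and may differ from the details of the model in the cited reference.

```python
import random

def preferential_attachment(S, m, seed=0):
    """Grow a network to S nodes; each new node links to m distinct existing
    nodes chosen with probability proportional to their degree."""
    rng = random.Random(seed)
    # Seed: a fully connected core of m + 1 nodes (an assumption of this sketch).
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Each node appears in `ends` once per link it has, so a uniform draw
    # from `ends` is a degree-proportional draw over nodes.
    ends = [node for edge in edges for node in edge]
    for new in range(m + 1, S):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        for t in targets:
            edges.append((new, t))
            ends.extend((new, t))
    return edges

tree = preferential_attachment(S=500, m=1)
print(len(tree))    # S - 1 = 499 links: the connected m = 1 network is a tree
```

For $m = 1$ every new node adds exactly one link to an already connected network, so the result has $S - 1$ links and no loops.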

We are unable to derive a general expression for the modularity of scale-free networks. However, for $m = 1$ the scale-free network is a tree. Thus,

$$M_{SF}(S, m = 1) = M_{1D}(S) = 1 - \frac{2}{\sqrt{S}}. \qquad (20)$$

For larger values of $m$, we find numerically that, at a fixed network size, the modularity is a linear function of $1/m$. The simplest possible ansatz for the modularity that verifies this condition and Eq. (20) simultaneously is

$$M_{SF}(S, m) = \left( a + \frac{1-a}{m} \right) \left( 1 - \frac{2}{\sqrt{S}} \right). \qquad (21)$$

As we show in Fig. 2, this approximation works well for $a = 0.165 \pm 0.009$.

FIG. 2: Modularity in scale-free networks. Numerical results of the modularity as a function of the network size $S$ for different values of $m$. These results are obtained by maximizing the modularity, Eq. (1), with simulated annealing. The lines are the predictions of Eq. (21), with $a = 0.165 \pm 0.009$ in all the cases.
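As a quick check of the two conditions imposed on the ansatz, the sketch below evaluates the form $(a + (1-a)/m)(1 - 2/\sqrt{S})$ with the fitted value $a = 0.165$ quoted in the text. The function name `modularity_scale_free` is hypothetical.

```python
import math

def modularity_scale_free(S, m, a=0.165):
    """Ansatz for scale-free networks; a = 0.165 is the fitted value in the text."""
    return (a + (1 - a) / m) * (1 - 2 / math.sqrt(S))

S = 1000
for m in (1, 2, 4, 8):
    print(m, round(modularity_scale_free(S, m), 3))
```

For $m = 1$ the expression reduces to the tree result $1 - 2/\sqrt{S}$, and at fixed $S$ the values fall on a straight line when plotted against $1/m$, approaching $a\,(1 - 2/\sqrt{S})$ for large $m$.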

Conclusions. We have shown that modularity in networks can arise due to a number of mechanisms. We have demonstrated that networks embedded in low-dimensional spaces have high modularity. We have also shown analytically and numerically that, surprisingly, random graphs and scale-free networks have high modularity due to fluctuations in the establishment of links.

Recently, several works have reported the existence of modules in complex networks and suggested that some evolutionary mechanism must enhance modularity. This statement is based, at best, on the fact that the modularity is large enough, and relies implicitly on the assumption that random graphs have low modularity.

Our results enable one to define statistically significant modularity in networks. We argue that, just as is already done for the clustering coefficient and other quantities, the modularity of complex networks must always be compared to the null case of a random graph. The analytical expressions we have derived provide a convenient way to carry out such a comparison.